BBN's Systems for the Chinese-English Sub-task of the NTCIR-9 PatentMT Evaluation

Authors

  • Jeff Z. Ma
  • Spyridon Matsoukas
Abstract

This paper describes the work we conducted to build a statistical machine translation (SMT) system for the Chinese-English sub-task of the NTCIR-9 patent machine translation (MT) evaluation [17]. First, we applied to patent data the various techniques that we had developed for improving SMT performance on other types of data. Our results show that most of these techniques work on patent document translation as well. Second, we changed our SMT system training to address special characteristics of patent documents. These changes produced additional improvements.

Similar resources

System Description of BJTU-NLP SMT for NTCIR-9 PatentMT

This paper presents an overview of the statistical machine translation systems that BJTU-NLP developed for the NTCIR-9 Patent Machine Translation Task (NTCIR-9 PatentMT). We compared the performance of the phrase-based translation model and the factored translation model in our patent SMT systems for Chinese to English and English to Japanese. Factored translation model was proposed as an extended phrase-base...

ZZX_MT: the BeiHang MT System for NTCIR-9 PatentMT Task

In this paper, we describe the ZZX_MT machine translation system for the NTCIR-9 Patent Machine Translation Task (PatentMT). We participated in the Chinese-English translation subtask and submitted three results, which correspond to three different models or decoding algorithms respectively. Both of the first two are phrase-based SMT approaches integrating the BTG constraint into reordering models, and...

Overview of the Patent Machine Translation Task at the NTCIR-9 Workshop

This paper gives an overview of the Patent Machine Translation Task (PatentMT) at NTCIR-9 by describing the test collection, evaluation methods, and evaluation results. We organized three patent machine translation subtasks: Chinese to English, Japanese to English, and English to Japanese. For these subtasks, we provided large-scale test collections, including training data, development data an...

An Improved Patent Machine Translation System Using Adaptive Enhancement for NTCIR-10 PatentMT Task

This paper describes the work that we conducted for the Chinese-English (CE) task of the NTCIR-10 patent machine translation evaluation. We built standard phrase-based and hierarchical phrase-based statistical machine translation (SMT) systems with optimized word segmentation, an adaptive language model, and an improved parameter tuning strategy. Our systems outperform official baselines by approximat...

The RWTH Aachen System for NTCIR-9 PatentMT

This paper describes the statistical machine translation (SMT) systems developed by RWTH Aachen University for the Patent Translation task of the 9th NTCIR Workshop. Both phrase-based and hierarchical SMT systems were trained for the constrained Japanese-English and Chinese-English tasks. Experiments were conducted to compare different training data sets, training methods and optimization criter...

Publication date: 2011